Gene overexpression beyond a permissible limit causes defects in cellular functions. However, the permissible limits of 
most genes are unclear. Previously, we developed a genetic method designated genetic tug-of-war (gTOW) to measure the 
copy number limit of overexpression of a target gene. In the current study, we applied gTOW to the analysis of all 
protein-coding genes in the budding yeast Saccharomyces cerevisiae. We showed that the yeast cellular system was robust 
against an increase in the copy number by up to 100 copies in >80% of the genes. After frameshift and segmentation 
analyses, we isolated 115 dosage-sensitive genes (DSGs) with copy number limits of 10 or less. DSGs contained a significant 
number of genes involved in cytoskeletal organization and intracellular transport. DSGs tended to be highly expressed and 
to encode protein complex members. We demonstrated that the protein burden caused the dosage sensitivity of highly 
expressed genes using a gTOW experiment in which the open reading frame was replaced with GFP. Dosage sensitivities of 
some DSGs were rescued by the simultaneous increase in the copy numbers of partner genes, indicating that stoichiometric 
imbalances among complexes cause dosage sensitivity. The results obtained in this study will provide basic knowledge 
about the physiology of chromosomal abnormalities and the evolution of chromosomal composition. 

[Supplemental material is available for this article.] 



Intracellular biochemical parameters, such as gene expression 
levels and protein activities, are highly optimized to maximize the 
performance of biological systems (Zaslaver et al. 2004; Dekel and 
Alon 2005; Wagner 2005). These parameters, however, have cer- 
tain permissive ranges to protect the function of the system against 
perturbations such as environmental changes, mutations, and 
noise in biochemical reactions. This robustness against fluctua- 
tions in parameters is considered a common design principle of 
biological systems (Alon et al. 1999; Little et al. 1999; von Dassow 
et al. 2000). When gene expression fluctuates beyond the robust- 
ness of cellular systems, various defects occur in the systems. How- 
ever, the differences in the expression limits of different genes and 
the factors influencing these differences are unclear. 

We previously developed the genetic tug-of-war (gTOW) 
method to measure the limit of gene overexpression (Moriya et al. 
2006, 2011, 2012). Using gTOW, we can assess the limit of gene 
overexpression as the copy number limit (CNL) of the target gene as 
follows. A target gene with its native regulatory sequences is cloned 
into a plasmid for gTOW. The plasmid carries a 2-micron origin, 
URA3, and LEU2 with a truncated promoter (leu2d). Yeast cells are 
transformed by the plasmid, and the transformants are first se- 
lected in medium lacking uracil (-Ura). The cells are then trans- 
ferred into medium lacking both uracil and leucine (-Leu-Ura). In 
this medium, leu2d becomes a selection bias to increase the plas- 
mid copy number in the cells because the cells with higher leu2d 
(plasmid) copy numbers grow faster. As the copy number increases, 
the copy number of the target gene also increases, and the gene 
becomes proportionally overexpressed according to the increased 
copy number. If the gene has an overexpression limit at which 
cellular function is halted when the limit is crossed (i.e., inducing 
cellular death), then the plasmid copy number must be less than 
the limit, and the target gene becomes a selection bias to decrease 
the plasmid copy number. Biases arising from leu2d (that increases 
the plasmid copy number) and the target gene (that decreases the 
plasmid copy number) determine the plasmid copy number in the 
cells (thus, we designated this method "genetic tug-of-war"). Be- 
cause the bias to increase the plasmid copy number by leu2d is 
always the same, the copy number should be associated with the 
CNL of overexpression of the target gene. The plasmid copy 
number determined under the -Leu-Ura condition is considered 
the CNL of overexpression of the target gene if the copy number is 
significantly lower than that of the empty vector control (which 
is usually ~100 copies per haploid genome). As the plasmid copy 
number and the cellular max growth rate under the -Leu-Ura 
condition are correlated with each other, max growth rate can also 
be an indicator of the CNL of the target gene. Ideally, in gTOW, the 
protein level expressed from the target gene increases according to 
the copy number increase. However, if the transcription factors for 
the target gene are diluted or if there is feedback in expression 
regulation, then the copy number increase might not be linearly 
reflected in the protein level. In this study, we thus designated 
the overexpression limit measured by gTOW as the 
"CNL of overexpression" to distinguish it from the limit of protein 
overexpression. We previously determined the CNLs of cell cycle 
regulatory genes in the budding yeast and fission yeast and found that 
their CNLs were diverse, ranging from less than two to more than 
100 (Moriya et al. 2006, 2011). 
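
The tug-of-war logic described above can be illustrated with a deliberately simplified sketch. It assumes that the plasmid copy number settles at whichever is lower, the empty-vector level reached under leu2d selection (about 100 copies) or the CNL of the insert. The function names and the hard minimum are illustrative assumptions, not part of the published method.

```python
def expected_plasmid_copy(cnl: float, vector_level: float = 100.0) -> float:
    """Toy model of the tug-of-war equilibrium (an assumption, not the authors' model).

    leu2d selection pushes the plasmid copy number up to the empty-vector
    level; a dosage-sensitive insert holds it at or below its CNL.
    """
    return min(cnl, vector_level)


def cnl_is_resolvable(cnl: float, vector_level: float = 100.0) -> bool:
    """A limit at or above the vector level cannot be told apart from 'no limit'."""
    return cnl < vector_level


for gene_cnl in (5.0, 40.0, 250.0):
    print(gene_cnl, expected_plasmid_copy(gene_cnl), cnl_is_resolvable(gene_cnl))
```

In this simplification, only genes whose limit lies below the vector level yield a measurable CNL, which is why the copy number is interpreted as a CNL only when it is significantly lower than that of the empty vector.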

Several genome-wide analyses revealed the genes that cause 
cellular dysfunction upon overexpression (Gelperin et al. 2005; 
Sopko et al. 2006). These analyses were performed using promoter 
swapping in which each target open reading frame (ORF)/protein 
is highly expressed by the strong inducible GAL1 promoter. The 
results obtained by promoter swapping and gTOW are known to be 
different (Moriya et al. 2006; Krantz et al. 2009) because the former 
technique causes absolute overexpression and the latter causes 
relative overexpression from the native level. The promoter swapping 
approach is useful for determining what happens when 
a target protein is abundant within the cell. However, with that approach it is 
difficult to determine how much the target is overexpressed when cellular 
dysfunction is observed. Because gTOW increases the copy number 
of the target gene with its native promoter, this determination is possible. 
We thus consider gTOW a useful method for evaluating 
the robustness of cellular systems by assessing how far 
gene expression has deviated from the native level when the system 
halts (Moriya et al. 2012). The advantage of gTOW is that one 
can not only isolate genes causing cellular dysfunctions upon 
overexpression but also quantitate the limits of gene overexpression 
that are associated with cellular robustness. In addition, we 
consider that gTOW is useful for evaluating cellular dysfunction 
triggered by the fluctuation of the gene copy number. 

In this study, we performed a genome-wide CNL measure- 
ment of genes of the budding yeast Saccharomyces cerevisiae using 
gTOW to reveal the profile of CNLs of all genes in this organism 
and determine why the yeast cellular systems are sensitive to mi- 
nor increases in the copy numbers of those genes. First, we isolated 
786 genes with significantly low CNLs. Further, we isolated genes 
with extremely low CNLs (10 or fewer copies per haploid genome), 
which we designated "yeast dosage-sensitive genes" (DSGs). Our 
results indicated that the yeast cellular system was robust against 
copy number variations (overexpression) in most genes but fragile 
against variations in a specific set of genes. Yeast DSGs tended to 
encode protein complex components, as well as proteins involved 
in cytoskeletal organization and intracellular transport. Our ex- 
perimental evidence suggested that protein burden and stoichio- 
metric imbalance are the primary causes of dosage sensitivity. 
These findings may have an interesting evolutionary implication 
in that DSGs function to constrain and secure the integrity of eu- 
karyotic genomes during evolution. 

Results 

gTOW6000: Analysis of all protein-coding genes in S. cerevisiae 
using gTOW 

To analyze all protein-coding genes in the S. cerevisiae genome 
using gTOW, we performed a series of experiments as summarized 
in Figure 1 (for details, see the Methods). We amplified all protein- 
coding genes (5806) with their native regulatory regions in the 
yeast strain BY4741 chromosome using polymerase chain reaction 
(PCR) and then cloned the genes into pTOWug2-836 (Supplemental 
Fig. S1; Moriya et al. 2012). Because not all promoter regions 
were identified, we cloned genes with their upstream and 
downstream sequences up to their neighboring genes (as an example, 
see Supplemental Fig. S2A,B). Cells harboring the gTOW 
plasmids with each target gene were cultivated in -Ura and 
-Leu-Ura media. We then measured max growth rate under the 
-Leu-Ura condition using online monitoring of cellular growth, 
and the plasmid copy numbers under the -Ura and -Leu-Ura 
conditions using quantitative PCR. We analyzed at least two in- 
dependent plasmid clones for each gene. The reproducibility be- 
tween each duplicate is shown in Supplemental Figure S3. To this 
point, we have succeeded in analyzing >95% of the genes in the 
Figure 1. Scheme of genome-wide analysis of protein-coding genes in S. cerevisiae with gTOW (gTOW6000). Each step of gTOW6000 is shown. For 
steps 2-7, the representative data of plate no. 13 are given as an example. The details of each step are described in Methods. 
yeast genome (the entire data can be found in Supplemental Table 
S1). Hereafter, we will refer to this analysis as "gTOW6000." 

Figure 2 shows the copy number under the -Leu-Ura con- 
dition determined in gTOW6000. gTOW6000 was performed us- 
ing 96-well microplates. We handled 244 plates, as we analyzed 
two clones under two culture conditions for each gene. For the 
purpose of data quality control and to obtain a negative control, 
several empty vector experiments were performed for each plate 
(a total of 230 measurements) (Supplemental Table S2). The aver- 
age of the empty vector experiments is shown as the orange line 
in Figure 2. To identify genes with significantly lower limits than 
the empty vector control, we evaluated the copy number data 
under the -Leu-Ura condition using Student's t-test. In total, 
919 genes had P-values <0.05, and 786 of them had lower copy 
numbers than the vector average (genes surrounded by a blue- 
dotted rectangle in Fig. 2). We thus considered the copy numbers 
of these genes under the -Leu-Ura condition to be their CNLs of 
overexpression. The average copy number of these genes was less 
than 85. This finding conversely indicates that the other 5000 
genes have similar or higher CNLs than the detectable CNL in 
gTOW using pTOWug2-836, and suggests that the yeast cellular 
system is generally robust against a nearly 100-fold increase in the 
copy number of any one of 80% of its genes. Although some 
genes displayed much higher limits than the vector average, there 
was no reproducibility between the two clones (Pearson's corre- 
lation coefficient between the duplicates of genes with average 
copy numbers of >250 was -0.26). We thus concluded that the 
findings were reflective of experimental errors. 
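
The comparison against the empty vector can be sketched as a two-sample Student's t statistic. The study does not spell out how the vector measurements were pooled, so the equal-variance form and the numbers below are assumptions chosen purely for illustration; converting the statistic to a P-value additionally requires the t distribution with the returned degrees of freedom.

```python
import math
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Student's two-sample t statistic (equal-variance form) and its degrees of freedom."""
    na, nb = len(sample_a), len(sample_b)
    pooled_var = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean(sample_a) - mean(sample_b)) / standard_error, na + nb - 2


# Hypothetical copy numbers under -Leu-Ura (NOT data from the study).
gene_clones = [12.0, 15.0]                      # two independent clones of one gene
empty_vector = [95.0, 110.0, 102.0, 88.0, 105.0]

t_stat, dof = pooled_t(gene_clones, empty_vector)
print(f"t = {t_stat:.2f}, df = {dof}")
```

A strongly negative statistic corresponds to a gene whose copy number is lower than the vector average, which is the direction of interest for identifying CNLs.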

In gTOW, there should be a correlation between the CNLs and 
max growth rates of low limit genes (Moriya et al. 2006, 2012). In 
Figure 2. Copy number limits (CNLs) of S. cerevisiae genes determined by gTOW analysis. Genes 
were ordered according to their average copy number determined by gTOW under the -Leu-Ura 
condition. Each gene has two data points because of the duplication of the experiment. The orange line 
and the transparent zone around the line indicate the average copy number with the empty vector and 
the standard deviation, respectively. Genes that showed significantly lower limits than those observed in 
the vector experiments (786 genes, P < 0.05) are surrounded by the blue dotted rectangle. Genes with 
CNLs of 10 and less (dosage-sensitive genes [DSGs]) are surrounded by the red-dotted rectangle. A 
confident set of DSGs isolated after frameshift and segmentation analyses (Fig. 4) is shown. The entire 
data set is given in Supplemental Table S1. 



addition, there should be a correlation between the copy numbers 
under the -Ura and -Leu-Ura conditions (Moriya et al. 2006). 
These expectations were confirmed in gTOW6000 (Supplemental 
Table S3). We next calculated the copy number causing 50% 
growth inhibition in gTOW6000. To reduce the effect of experi- 
mental errors, we first calculated the moving averages of max 
growth rates and CNLs for 100 of the 786 genes with significantly 
low CNLs (Supplemental Fig. S4A). To approximate the relationship 
between CNL and max growth rate (Supplemental Fig. S4B), 
we derived a first-order (linear) equation as follows: 
$\mathrm{CNL} = 49.24 \times [\text{max growth rate}]$ ($R^2 = 0.98$). From the equation, the copy number 
that gave 50% growth inhibition (max growth rate = 1.11) was 
calculated to be 54.7 copies. If the target gene has a very low limit, 
then the cells expressing the gTOW plasmid cannot grow under 
the -Leu-Ura condition because they cannot produce sufficient 
amounts of leucine (Moriya et al. 2006). We next evaluated the 
lower limit copy number resulting in no growth in gTOW6000. We 
calculated the moving averages of max growth rates as described 
previously in this section. For each bin, we then counted the 
number of genes displaying no growth (max growth rate is set as 
0.1; see Methods) in both of the duplicated experiments (i.e., frequency 
of no-growth) (Supplemental Fig. S5A). To approximate the 
relationship between frequency of no-growth and CNL (Supplemental 
Fig. S5B), we derived the following equation: 
$[\text{frequency of no-growth}] = -0.0002 \times \mathrm{CNL}^3 + 0.0476 \times \mathrm{CNL}^2 - 3.6046 \times \mathrm{CNL} + 101.53$ 
($R^2 = 0.996$). From this equation, we calculated that cells carrying a gene 
with a CNL of 18.4 would fail to grow in 50% of cases in the gTOW 
experiment. 
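
The two fitted relationships can be checked numerically. The sketch below uses the coefficients exactly as printed (they are rounded, so small deviations are expected) and solves the cubic by bisection; the function names and the search interval of 0 to 30 copies are illustrative choices, not taken from the study.

```python
def cnl_from_growth_rate(max_growth_rate: float) -> float:
    """Linear fit quoted in the text: CNL = 49.24 x [max growth rate]."""
    return 49.24 * max_growth_rate


def no_growth_frequency(cnl: float) -> float:
    """Cubic fit quoted in the text: frequency of no-growth (%) as a function of CNL."""
    return -0.0002 * cnl**3 + 0.0476 * cnl**2 - 3.6046 * cnl + 101.53


def cnl_at_frequency(target: float, lo: float = 0.0, hi: float = 30.0, tol: float = 1e-6) -> float:
    """Bisection for the CNL at which the cubic equals `target`.

    Assumes the cubic is decreasing on [lo, hi], which holds between 0 and 30 copies.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if no_growth_frequency(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


cnl_half_growth = cnl_from_growth_rate(1.11)
cnl_half_no_growth = cnl_at_frequency(50.0)
print(round(cnl_half_growth, 1), round(cnl_half_no_growth, 1))
```

With the printed coefficients, the linear fit gives about 54.7 copies at a max growth rate of 1.11, and the cubic crosses 50% close to the 18.4 copies quoted above.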

By use of genome-wide screening, Sopko et al. (2006) previously 
isolated 767 S. cerevisiae genes that caused cellular growth 
defects when overexpressed by the GAL1 promoter. As we isolated a 
similar number of genes with low CNLs (786 genes), we compared the 
two data sets. As shown in Figure 3A, only 161 of the 786 genes 
isolated by gTOW6000 overlapped with those in the study by Sopko 
et al. (2006), although the overlap was significant (P < 1.5 × 10^-8, 
chi-square test). The difference possibly arose from the difference 
in the experimental systems for overexpressing genes, as discussed 
in the Introduction. The difference was significant when we 
separated isolated genes by their native expression levels (Fig. 
3B). Highly expressed genes were significantly overrepresented among 
genes with low CNLs in gTOW6000 (P = 1.322 × 10^-15 in the 
Mann-Whitney U-test), whereas this finding was not replicated in the 
study by Sopko et al. (2006) (P = 0.7378 in the Mann-Whitney U-test). 
Another difference between the two experiments was the proportions 
of protein complex members. The 786 genes isolated by gTOW contained 
significant numbers of protein complex members (Table 1), whereas 
the 767 genes isolated by Sopko et al. (2006) did not contain many 
protein complex members (Table 1). This might reflect the fact that 
protein complex members tend to be highly expressed (Supplemental 
Figure 3. Comparison of gTOW6000 data with data of another overexpression analysis performed 
using promoter swapping. (A) Overlap of genes identified by the overexpression analyses performed by 
Sopko et al. (2006) and in this study. (B) Distribution of genes identified by overexpression analysis 
ordered by their native protein levels. Each bin contains genes ordered by their native protein levels 
(Ghaemmaghami et al. 2003). The protein abundance unit is molecules per cell. Error bars, SEM. 



Fig. S6). From these results, we considered that gTOW6000 would 
provide additional clues to understand the cellular effects of gene 
overexpression, as this method isolated a different subset of genes 
from previous promoter swapping experiments. Among the 161 overlapping 
genes (Fig. 3A), the bias toward highly expressed genes seen in the 786 
gTOW6000 genes was lost (Fig. 3B), whereas protein complex members, 
which were not enriched among the 767 genes isolated by Sopko et al. 
(2006), were enriched (Table 1), probably because each of these 
features reflects the characteristics of the other data set. 
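
The significance of the overlap between the two gene sets can be approximated with a 2 × 2 chi-square test. The background size used here (5783, the denominator of the "All genes" row in Table 1) and the omission of a continuity correction are assumptions, so the result should be read as a plausibility check on the reported P < 1.5 × 10^-8 rather than a reproduction of the authors' calculation.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns the statistic and its upper-tail P-value for one degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # exact tail for df = 1
    return chi2, p_value


background = 5783                 # assumed universe of genes (see lead-in)
gtow, sopko, both = 786, 767, 161  # counts given in the text

a = both                               # isolated in both studies
b = gtow - both                        # gTOW6000 only
c = sopko - both                       # Sopko et al. (2006) only
d = background - gtow - sopko + both   # isolated in neither study

chi2, p_value = chi_square_2x2(a, b, c, d)
expected_overlap = gtow * sopko / background
print(f"expected overlap ~{expected_overlap:.0f}, chi2 = {chi2:.1f}, P = {p_value:.1e}")
```

Under these assumptions, roughly 104 shared genes would be expected by chance, so the observed 161 is a clear excess even though it covers only about one-fifth of either set.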



Isolation of low limit genes (yeast DSGs) 

To further understand the characteristics of low limit genes, we 
performed additional experiments to isolate a confident set of 
genes with CNLs of 10 or less. We introduced a frameshift mutation 
in each of the 182 genes to confirm whether the expression 
of the protein, rather than DNA or RNA elements, determined 
the limit (Fig. 4A). Frameshift analysis could also determine 
whether either of the bidirectionally overlapping genes was 
the cause of the low CNL (for example, see Supplemental Fig. S7A). 
Among the 155 genes with CNLs of 20 or less, the frameshift 
mutants of 140 of these genes displayed more than fivefold higher 
CNLs than the wild-type genes or their CNLs increased to the 
vector level (~100 copies) (Fig. 4B; Supplemental Table S4). 
We thus verified that the original target ORFs of these 140 genes 
determined the CNLs (denoted as "fs verified" in Supplemental 
Tables S1, S4). 
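
The verification rule stated above can be written as a small predicate. The fivefold threshold and the vector level of about 100 copies come from the text; the tolerance that decides when a CNL counts as having "increased to the vector level" is a hypothetical parameter introduced only for this sketch.

```python
def fs_verified(wt_cnl: float, fs_cnl: float,
                fold_threshold: float = 5.0,
                vector_level: float = 100.0,
                vector_tolerance: float = 0.2) -> bool:
    """Return True if the frameshift (fs) mutant relieves the low limit.

    Criterion from the text: the fs mutant's CNL is more than fivefold the
    wild-type CNL, or it rises to the empty-vector level (~100 copies).
    `vector_tolerance` is a hypothetical choice, not taken from the study.
    """
    more_than_fivefold = fs_cnl > fold_threshold * wt_cnl
    at_vector_level = fs_cnl >= vector_level * (1.0 - vector_tolerance)
    return more_than_fivefold or at_vector_level


# Hypothetical (wild-type CNL, frameshift CNL) pairs, for illustration only.
for wt, fs in [(5.0, 60.0), (18.0, 85.0), (8.0, 12.0)]:
    print(wt, fs, fs_verified(wt, fs))
```

The second clause matters for genes whose wild-type CNL is already near 20: a fivefold increase would then exceed the vector level and could never be observed.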

We further analyzed the 15 genes whose frameshift mutants did not 
exhibit increased limits (12 of them are indicated by red circles in 
Fig. 4B). They were categorized into four types as follows. (1) One of 
the overlapping ORFs appeared to cause the low limits. The cloned 
regions contained two overlapping ORFs in the cases of YFL010C/WWM1-
YFL010W-A/AUA1 and YGL167C/PMR1-YGL168W/HUR1. Because 
the frameshift mutants of WWM1 and PMR1 displayed increased 
CNLs, we concluded that these genes were responsible for 
the low CNLs. The result for YGL167C is shown in Supplemental 
Figure S7A as an example. (2) Because both clones containing one 
of the two neighboring genes (YNL024C-A/KSH1-YNL025C/SSN8) 
exhibited low CNLs but the frameshift mutations did not in- 
crease the CNL of either gene (Supplemental Fig. S7B), we con- 
cluded that an RNA gene (NME1) caused the low limits. (3) For 
genes for which the frameshift mutations did not increase their 
CNLs but the cause could not be ascertained from their genome 
annotations, we segmented the fragments into 5' UTR and ORF-3' 
UTR fragments and measured their limits (Fig. 4A). Both the 5' 
and 3' segmented fragments of CPS1, FHL1, GRX3, HOM3, TPK1, 
and TPK3 (underlined in blue in Fig. 4C) displayed increased 
copy numbers. These ORFs may have been expressed from ATGs 
other than the annotated ones. (4) The segmented fragments 
(ORF-3' UTR) of ASE1, DIE2, IRC8, and SFP1 did not exhibit in- 
creased CNLs (underlined in red, Fig. 4C). For DIE2 and IRC8, we 



Table 1. Characteristics of DSGs 

Gene sets (rows): Yeast DSG f (limit ≤10); gTOW6000 786 genes; Overlapped 161 genes; Sopko 767 genes; All genes. A P-value follows the percentage where one is given. 

Protein complex members a 
  Yeast DSG:             69.6% (80/115), P = 9.05 × 10^-7 
  gTOW6000 786 genes:    61.5% (483/786), P < 2.2 × 10^-16 
  Overlapped 161 genes:  62.7% (101/161), P = 4.06 × 10^-5 
  Sopko 767 genes:       46.0% (353/767) 
  All genes:             46.5% (2690/5783) 

(column heading not recovered) 
  Yeast DSG:             75.7% (87/115), P = 7.80 × 10^-10 
  gTOW6000 786 genes:    60.3% (474/786), P = 9.37 × 10^-15 
  Overlapped 161 genes:  64.6% (104/161), P = 1.37 × 10^-5 
  Sopko 767 genes:       57.4% (440/767), P = 3.92 × 10^-9 
  All genes:             47.4% (2742/5783) 

Genes with no. of PPIs ≥5 b 
  Yeast DSG:             36.5% (42/115), P = 1. × 10^-10 
  gTOW6000 786 genes:    25.7% (202/786), P < 2.2 × 10^-16 
  Overlapped 161 genes:  29.8% (48/161), P = 1.40 × 10^-7 
  Sopko 767 genes:       21.1% (162/767), P = 3.08 × 10^-7 
  All genes:             14.9% (863/5783) 

Intrinsic protein 
  Yeast DSG:             23.5% (27/115), P = 3.43 × 10^-7 
  gTOW6000 786 genes:    24.8% (195/786), P < 2.2 × 10^-16 
  Overlapped 161 genes:  32.3% (52/161), P = 4.91 × 10^-12 
  Sopko 767 genes:       26.2% (201/767) 

(values whose row and column could not be recovered) 
  3.75 × 10^-7 
  21.8% (167/767) 

(column heading not recovered) 
  Yeast DSG:             26.1% (30/115) 
  gTOW6000 786 genes:    27.4% (215/786), P = 9.90 × 10^-8 
  Overlapped 161 genes:  28.0% (45/161), P = 0.01705 
  Sopko 767 genes:       20.9% (160/767) 
  All genes:             20.2% (1168/5783) 

a Protein complex components (mips; ftp://ftpmips.gsf.de/yeast/catalogues/complexcat/complexcat_data_18052006). 
b Protein-protein interactions (dip; http://dip.doe-mbi.ucla.edu). 
c Intrinsic protein disorder (Vavouri et al. 2009). 
d Yeast ohnolog (http://wolfe.gen.tcd.ie/ygob/). 
e Essential genes (http://www-deletion.stanford.edu/YDPM/YDPM_index.html). 
f Complete data set for yeast DSGs is given in Supplemental Table S5. 
Figure 4. Frameshift and segmentation analyses of candidate low limit genes. (A) Structure of the 
plasmid used in frameshift analysis and segmentation analysis. (Red letters) The nucleotide inserted to 
generate frameshift. The introduced FspI site in the mutant is underlined. (B) A scatter plot of the CNLs of 
the wild-type genes and the frameshift mutants of low limit genes. (Black circles) Genes that displayed 
increased CNLs when frameshift was introduced. (Red circles) Genes that did not display increased CNLs 
even when frameshift was generated. Note that the frameshift mutants of AUA1, GAT1, and FHL1 could 
not be obtained, probably because their frameshift mutants also have very low limits. The raw data can 
be found in Supplemental Table S4. (C) CNLs of segmented genes. Genes underlined with a blue line 
are those that displayed increased CNLs upon segmentation. Genes underlined with a red line indicate 
genes that did not display increased CNLs upon segmentation. 



performed additional segmentation analysis (Supplemental Fig. 
S8). The 3' regions of both genes had elements causing the low 
limits, although their functions are still unknown (Supplemental 
Fig. S8). 

By use of the aforementioned analysis, we isolated 115 DSGs 
by removing the overlapping genes (AUA1 and HUR1), the RNA 
gene (NME1), the genes for which their low limits were not caused 
by their annotated ORFs (DIE2 and IRC8), and a real-time PCR 
reference gene (LEU3) from the list of genes with CNLs of 10 or 
less (Fig. 2; Supplemental Table S5). Among the yeast DSGs, 88 
genes were previously isolated in screenings of genes causing 
toxicity upon overexpression by promoter swapping (Liu et al. 1992; 
Espinet et al. 1995; Akada et al. 1997; Stevenson et al. 2001; Boyer 
et al. 2004; Gelperin et al. 2005; Sopko et al. 2006; Niu et al. 
2008; Yoshikawa et al. 2011). According to the Saccharomyces Genome 
Database (SGD; http://www.yeastgenome.org), the overexpression of 
~1900 genes was reported to cause lethality or decreased cell growth. 
This study isolated another set of genes causing growth defects after 
only a minor increase in copy number (overexpression relative to the 
native level). Jones et al. (2008) created a comprehensive overlap 
DNA library of the S. cerevisiae genome using a 2-micron-based 
multicopy vector. They tested the toxicity of each clone to yeast 
cells and identified 23 toxic DNA segments. We can assume that the 
yeast DSGs isolated in our study are responsible for the toxicity of 
the DNA segments. In total, 12 of the 23 toxic clones actually 
contained DSGs isolated in this study (Supplemental Table S6). At 
present, it is unclear why clones without yeast DSGs are toxic. The 
toxicities of these clones might be explained by the additive effect 
of weak DSGs within the same clone, or we may have failed to clone 
the promoters of target genes that were present beyond the 
neighboring genes. 

We next analyzed the characteristics of isolated DSGs (Table 1). 
DSGs were significantly enriched in protein complex members, 
proteins with many interaction partners, and proteins with more 
intrinsically disordered regions. Although the difference was not 
significant, the percentage of essential genes among yeast DSGs was 
higher than that within the entire genome. DSGs also tended to be 
highly expressed (P = 4.696 × 10^-6 in the Mann-Whitney U-test) 
(Supplemental Fig. S9), as did the 786 low limit genes (Fig. 3B). 
Yeast DSGs contained significantly higher percentages of genes in 
the Gene Ontology categories of cytoskeletal organization and 
intracellular transport (Table 2), whereas transcription factors and 
signaling molecules (protein kinase and phosphatase) were not 
enriched (data not shown). Figure 5 presents a gene network 
constituted according to the functional category of each gene and 
their physical (protein-protein and protein-DNA) interactions that 
were described in SGD. 
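
The Z-scores reported in Table 2 are consistent with the usual standard score, (observation - mean) / SD, where the mean and SD presumably describe the count expected for a gene set of the same size. The sketch below recomputes three of them from the printed values; how the null mean and SD were obtained is not stated here, so that interpretation is an assumption.

```python
def z_score(observed: float, mean: float, sd: float) -> float:
    """Standard score of an observed count against a null mean and SD."""
    return (observed - mean) / sd


# (observation, mean, SD) as printed in Table 2 for three GO terms.
rows = {
    "0006810: Transport": (41, 25.1, 4.2),
    "0016044: Cellular membrane organization": (17, 6.8, 2.4),
    "0016192: Vesicle-mediated transport": (20, 9.2, 2.8),
}

for term, (obs, mean, sd) in rows.items():
    print(f"{term}: Z = {z_score(obs, mean, sd):.2f}")
```

Small differences from the tabulated Z-scores are expected because the printed means and SDs are themselves rounded.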



Table 2. Gene Ontology analysis of yeast DSGs 

Gene Ontology identification: term                Observation   Mean   SD    Z-score   P-value 
Biological process 
  0006810: Transport                              41            25.1   4.2   3.8       1.37 × 10^-2 
  0016044: Cellular membrane organization         17            6.8    2.4   4.2       1.38 × 10^-2 
  0007049: Cell cycle                             25            12.8   3.4   3.6       2.19 × 10^-2 
  0016192: Vesicle-mediated transport             20            9.2    2.8   3.9       2.25 × 10^-2 
Molecular function 
  N.A. 
Cellular component 
  0005856: Cytoskeleton                           19            4.9    2.2   6.3       5.49 × 10^-6 
  0005938: Cell cortex                            12            3.4    1.7   4.9       2.55 × 10^-3 
  0005624: Membrane fraction                      13            4.6    2     4.3       1.24 × 10^-2 
  0030427: Site of polarized growth               14            5.4    2.2   3.9       1.52 × 10^-2 
  0016023: Cytoplasmic membrane-bounded vesicle   9             2.7    1.5   4.3       3.57 × 10^-2 
  0005815: Microtubule organizing center          7             1.7    1.1   4.8       3.67 × 10^-2 

Complete data set is given in Supplemental Table S5. 

Protein burden causes dosage sensitivity 

The fact that DSGs tended to be highly expressed suggests that the 
increased copy number of a highly expressed gene exerts a burden 
on protein turnover (Stoebel et al. 2008; Sheltzer and Amon 2011), 
which causes the dosage sensitivities of yeast DSGs. We thus selected 
six highly expressed genes (Partow et al. 2010) and replaced 
each ORF with the green fluorescent protein (GFP) (Fig. 6A; 
Cormack et al. 1997). TEF1 and TDH3 were the DSGs isolated in 
this study. If the overproduction of an unnecessary protein, but 
not the specific function of the protein, determines the limit of 
a gene, then the copy number of the artificial gene should also be 
limited. As shown in Figure 6B, five out of six GFP constructs 
exhibited significantly lower limits compared with the vector 
control (P < 0.05, Student's t-test); moreover, the CNLs (the copy 
numbers under the -Leu-Ura condition) of native and GFP-replaced 
genes were highly correlated (Pearson's correlation = 0.90) 
(Fig. 6C). In addition, acceleration of GFP degradation by adding 
a degradation signal (Fig. 6A; Jungbluth et al. 2010) further reduced 
the CNLs (Fig. 6B) and increased the correlation (Pearson's 
correlation = 0.94) (Fig. 6D), indicating that the accumulated GFP itself 
does not cause gene toxicity. These observations suggest that 
a minor increase in the copy number of highly expressed genes 
causes a protein turnover burden that leads to dosage sensitivity. 
Figure 5. Molecular interactions between DSGs. Yeast DSGs were colored according to their functional category annotated in the Saccharomyces 
Genome Database (SGD). Genes were connected by their protein-protein interactions (solid lines), functional relationships (dotted lines), and 
protein-DNA interactions (thin lines). The interaction data were obtained from BioGRID (http://thebiogrid.org/). White-colored genes and bold lines denote 
the candidate partners and their interactions experimentally tested by 2D-gTOW, respectively (Fig. 7; Supplemental Figs. S11, S12; Table 3; 
Supplemental Table S7). The network was created using Cytoscape 2.8.1 (http://www.cytoscape.org/) and modified using Illustrator CS5 (Adobe) and 
PowerPoint 2011 (Microsoft). 
Figure 6. Protein burden causes dosage sensitivity. (A) Plasmid constructs to examine the protein burden. TEF1 is shown as an example of highly 
expressed target genes. We constructed these artificial genes using pTOW40836, introduced the plasmids into yeast strain BY4741, and then measured 
the upper CNLs and the maximal GFP fluorescence. ODC degron indicates the degron from the mouse ornithine decarboxylase gene (Jungbluth et al. 
2010). (B) CNLs of native and GFP-replaced genes. The gene names on the horizontal axis indicate that their ORFs were replaced by GFP, as shown in A. (C) 
Comparison of the copy numbers of native- and GFP-replaced genes. (D) Comparison of the copy numbers of native- and GFPdeg-replaced genes. 



If the protein expressed from the gene is unstable, then the dosage 
sensitivity could be accelerated because of the increased protein 
turnover burden. 
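
The comparison between native and GFP-replaced constructs rests on Pearson's correlation coefficient. A minimal sketch of that calculation follows; the six pairs of CNLs are hypothetical placeholders chosen only to make the example runnable and are not the values measured in this study.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical CNLs for six constructs (NOT data from the study).
native_cnl = [8.0, 15.0, 30.0, 45.0, 60.0, 90.0]
gfp_cnl = [20.0, 35.0, 40.0, 70.0, 65.0, 95.0]

r = pearson_r(native_cnl, gfp_cnl)
print(f"Pearson's correlation = {r:.2f}")
```

A coefficient close to 1 means that constructs rank similarly whether the native ORF or GFP is expressed, which is the pattern expected if the amount of protein produced, rather than its function, sets the limit.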

Dosage imbalance causes dosage sensitivity 

Although protein burden causes the dosage sensitivities of some 
DSGs as demonstrated in this study, it is apparently not the only 
mechanism to explain the dosage sensitivities of all yeast DSGs, 
because, for example, the upper limit of native TEF1 was far lower than that 
of the GFP construct (Fig. 6B), and some yeast DSGs encoded lowly 
expressed proteins (Supplemental Fig. S10). As indicated above, 
protein complex components were highly concentrated among 
yeast DSGs (Table 1). It is thus possible that stoichiometric imbalance 
(Papp et al. 2003; Torres et al. 2007; Veitia and Birchler 
2010) is another mechanism leading to the dosage sensitivities of 
yeast DSGs. Ohnologs are genes created by ancient whole-genome 
duplication events and are retained in the genome. We and others 
previously proposed that they are dosage balanced (Veitia et al. 
2008; Makino and McLysaght 2010). Thus, we compared the yeast 
DSGs and ohnologs and found that they overlapped significantly 
(Table 1; Supplemental Table S5). This also supports the idea that 
dosage imbalance causes the dosage sensitivity of DSGs. In fact, we 
previously demonstrated that the dosage sensitivity of one DSG, 
CDC14, arose from a dosage imbalance against NET1 (Kaizu et al. 
2010). We also demonstrated a similar dosage balance between the 
GTPase gene spg1 and its GAP byr4 in fission yeast (Moriya et al. 
2011). 

To test the assumption that stoichiometry imbalance causes 
the toxicity of DSGs, we attempted to identify DSGs that are dosage 
balanced with their partner genes. We first created a list of po- 
tential dosage partners for DSGs using information about protein- 
protein interactions and their functional effects described in SGD 
(Supplemental Table S7). We then performed a series of experi- 
ments that examined whether the partner candidate could rescue 
the toxicity of individual DSGs as shown in Figure 7. A gTOW 
plasmid carrying DSG and another plasmid (pRS423ks) with the 
candidate partner were simultaneously introduced into yeast cells, 
and the cells were then grown under -Ura and -Leu-Ura condi- 
tions (Fig. 7A). If the candidate is the partner, then the toxicity of 
DSG is rescued and the cells can grow on -Leu-Ura plates. If both 
DSG and the partner are in dosage balance, then the copy numbers 
of both genes in surviving cells must be conserved. The case of 
GLN3 (DSG) and URE2 (candidate partner) is shown as an example 
in Figure 7, B, C, and D. Among the 49 pairs tested, 13 were demonstrated 
to be in dosage balance (Supplemental Table S7; Supplemental 
Figs. S11, S12). We note that the previously suggested dosage 
balance between the tubulin genes TUB2 and TUB1 (Weinstein and 
Solomon 1990) was hardly detected in our experiment, whereas we 
detected a balance between TUB2 and RBL2 (Supplemental Fig. S13). 

Figure 7. Testing dosage balance between DSGs and their candidate partners. (A) The experimental design of 2D-gTOW to determine whether two
genes are dosage partners (Kaizu et al. 2010). First, we transformed a yeast strain with two plasmids expressing DSG and its candidate partner and then
tested whether the transformant could grow under the -Leu condition and whether both the plasmids were balanced. (B,C,D) Examples of 2D-gTOW
experiments with GLN3 (DSG) and its partner URE2. (B) Plate assay: High-copy URE2 supports the growth of yeast cells with high-copy GLN3. (C) Copy
numbers of pTOW-GLN3 and pRS423ks-URE2 under the low-copy (-His-Ura) and high-copy (-His-Leu-Ura) conditions. (D) The copy numbers of GLN3
and URE2 in 2D-gTOW experiments are balanced. Other experimental results can be found in Supplemental Figures S11, S12, and S13.



Analyzed interactions and confirmed dosage-balanced interac- 
tions are indicated by bold lines and blue bold lines in Figure 5, 
respectively. We thus concluded that dosage imbalance was a cause 
of the dosage sensitivity of at least some yeast DSGs. 

Discussion 

In this study we applied gTOW to measure the CNLs of over- 
expression of nearly all protein-coding genes in S. cerevisiae and 
identified 115 DSGs with CNLs of 10 or less. From the character- 
istics of the genes (e.g., they tended to be highly expressed and 
complex members), we speculated that protein burden and stoi- 
chiometry imbalance caused the dosage sensitivity of these genes. 
We further experimentally verified the hypothesis using gTOW 
experiments. The results indicated that there are at least two dif- 
ferent causes of dosage sensitivity: specific and nonspecific causes 
related to gene function. We currently think that for some DSGs, 
the dosage imbalance by itself causes severe dosage sensitivities. 
We have isolated some DSGs where the dosage sensitivities were 
suppressed by the simultaneous overexpressions of their partners 
(Table 3). The copy numbers of these DSGs can increase (their 
proteins are further overexpressed) when their partners are abun- 
dant, and hence, their protein turnover does not appear to cause 
their dosage sensitivities. 

Disomy of any of the 16 S. cerevisiae chromosomes causes cellular
growth defects resulting from the overexpression of particular
genes on the disomic chromosome (Torres et al. 2007). Several
possible mechanisms by which aneuploidy can cause cellular 
dysfunction have been proposed (Sheltzer and Amon 2011). Be- 
cause disomy causes the duplication of all genes on the chromo- 
some, it is difficult to identify specific genes, and consequently the 
specific mechanisms, causing dosage sensitivity. The mechanisms 
causing dosage sensitivity that were inspected in this study should 
have some shared features with aneuploidy. 

Although we focused on DSGs in this study, yeast cellular 
systems were robust against ~100-fold overexpression in >80% of
their genes (Fig. 2). According to the characteristics of DSGs found 
in this study, genes with low expression without dosage balance 
were conversely considered dosage insensitive. Genes with tightly 
controlled expression or enzymes with regulation that is not sub- 
unit dependent (e.g., regulated by intramolecular interactions) will 
be robust against copy number increase. The domain organization 
of proteins, e.g., a catalytic domain and a regulatory domain in the 
same protein, could have evolved to avoid dosage sensitivity. 

Why do DSGs remain in the present yeast genome? In addition,
why have cellular systems not evolved to avoid the existence
of DSGs? One possibility is that dosage sensitivity has its own 
important function; if DSGs and their dosage partners are rea- 
sonably scattered around chromosomal regions, then they will 
constitute a dosage balance network (the network identified in this 
study is shown in Fig. 8). This network potentially constrains and 
secures the composition of an organism's chromosomes because 

Table 3. Verified stoichiometric partners for DSGs^a

DSG    Upper limit   Partner   Reference                 Interaction reported
BFA1   3.5           TEM1      Park et al. 2004          Synthetic rescue
GLN3   1.5           URE2      Palmer et al. 2009        Synthetic rescue
MYO1   6.5           MLC1
MYO2   12.1          MLC1      Stevens and Davis 1998    Dosage rescue
MYO4   6.5           MLC1
PPZ1   0.3           SIS2      Clotet et al. 1999        Dosage rescue
PPZ1   0.3           VHS3      de Nadal et al. 1998      Synthetic rescue
PPZ2   9.3           SIS2      BioGRID                   Physical interaction
SEC4   5.2           SEC2      Ortiz et al. 2002         Dosage rescue
TPK1   0.9           BCY1      BioGRID                   Physical interaction
TPK2   2.1           BCY1      Nehlin et al. 1992        Dosage rescue
TPK3   0.6           BCY1      Mazon et al. 1993         Phenotypic enhancement
TUB2   2.7^a         RBL2      Abruzzi et al. 2002       Phenotypic suppression

^a Complete data set is given in Supplemental Figures S11, S12, and S13 and Supplemental Table S7.

chromosomal aberration in a cell disrupts the balance within the
network, which reduces the fitness of the cell. The reason why the 
genomic composition of current organisms is stable could be that 
the dosage balance network functions as a sentinel of abnormality. 
This could explain how and why the eukaryotic chromosomes 
were established and maintained during evolution in a relatively 
stable manner. If our hypothesis is true, the DSGs and their part- 
ners should be located on different chromosomes. In S. cerevisiae, 
all the DSGs and their partners identified in this study were ac- 
tually distributed on different chromosomes (Fig. 8). Analyzing 
the distributions of DSGs and their partners in species related to 
S. cerevisiae (before and after genome duplication) is one way of 
obtaining further evidence for this hypothesis. 

Methods 

Strains, growth conditions, and yeast transformation 

S. cerevisiae strain BY4741 (MATa his3Δ1 leu2Δ0 met15Δ0 ura3Δ0)
(Brachmann et al. 1998) was used for gTOW6000 analysis. Yeast 
cultivation and transformation were performed as previously 
described (Amberg et al. 2005). Synthetic complete (SC) medium
without the indicated amino acids was used for the cultivation of
yeast. 

Plasmids used in this study 

pTOWug2-836 (Supplemental Fig. S1; Moriya et al. 2012) was used
for gTOW6000 analysis. pTOW40836, a pTOWug2-836 derivative that
lacks the GFP gene in the backbone (Moriya et al. 2012), was used
for the GFP replacement experiments in
Figure 6. pRS423ks, which was used to clone partner genes for
two-dimensional gTOW experiments, is a derivative of pRS423 
(Christianson et al. 1992), and it has two additional primer sites 
outside the multicloning site (indicated as K_primer and S_primer 
in Supplemental Fig. S14). The K and S priming sites allowed us to
selectively amplify the insert of pRS423ks from the cells harbor- 
ing pTOW and pRS423ks. gTOW6000 plasmid clones were con- 
structed as described below. The plasmids used for the frameshift 
analysis, the segmentation analysis, and the GFP replacement 
analysis were constructed as shown in Supplemental Figures S15,
S16, and S17, respectively. Primer sequences used to construct the
gTOW6000 plasmids are listed in Supplemental Table S8. Other 
primer sequences are available upon request. Individual gTOW6000
plasmids are available from the National BioResource Project-Yeast
(http://yeast.lab.nig.ac.jp/). 



PCR 

All DNA fragments were amplified by 
PCR using the high-fidelity DNA poly- 
merase KODplus (Toyobo) according to 
the method described in the manufac- 
turer's protocol. 



DNA extraction and determination 
of the plasmid copy number 

DNA samples were prepared according 
to the method described previously 
(Moriya et al. 2006). The copy numbers 
of pTOWug2-836, pTOW40836 and 
pRS423ks were measured using real-time 
PCR according to the method described 
previously (Moriya et al. 2006; Kaizu 
et al. 2010) using Lightcycler480 (Roche). 
LEU2 (LEU2-2F: 5'-GCTAATGTTTTGGCCTCTTC-3'; LEU2-2R:
5'-ATTTAGGTGGGTTGGGTTCT-3') and HIS3 primer sets (HIS3-1F:
5'-TTCCGGCTGGTCGCTAAT-3'; HIS3-1R: 5'-GCGCAAATCCTGATCCAAAC-3')
were used to measure the copy numbers of pTOW
vectors and pRS423ks, respectively. The LEU3 primer set (LEU3-3F:
5'-CAGCAACTAAGGACAAGG-3'; LEU3-3R: 5'-GGTCGTTAATGAGCTTCC-3')
was used to amplify the genomic DNA. Because we
used LEU3 as a reference gene for the genome in the copy number 
determination using real-time PCR, the calculated CNL of LEU3 is 
always one. 
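
As a rough illustration of this calculation, the following Python sketch estimates plasmid copies per genome from real-time PCR threshold cycles (Ct), with LEU3 as the single-copy genomic reference. It assumes the common relative-quantification relation (copy number = E^(Ct_reference - Ct_plasmid)) and an identical amplification efficiency E for both primer sets; the function name, the Ct values, and the default efficiency are hypothetical and are not taken from the protocol of Moriya et al. (2006).

```python
def plasmid_copy_number(ct_plasmid, ct_reference, efficiency=2.0):
    """Estimate plasmid copies per haploid genome from qPCR Ct values.

    ct_plasmid   -- Ct of the plasmid marker (e.g., LEU2 for pTOW vectors)
    ct_reference -- Ct of the single-copy genomic reference (LEU3)
    efficiency   -- assumed fold amplification per cycle (2.0 = perfect doubling)
    """
    if efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 1")
    # Each cycle of earlier threshold crossing corresponds to one more
    # fold-increase (by `efficiency`) in starting template.
    return efficiency ** (ct_reference - ct_plasmid)


# Hypothetical Ct values, for illustration only.
print(plasmid_copy_number(20.0, 20.0))  # reference measured against itself
print(plasmid_copy_number(17.0, 20.0))  # marker crosses threshold 3 cycles earlier
```

Under these assumptions, measuring the reference against itself returns one copy, which mirrors the statement that the calculated CNL of LEU3 is always one.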

Measuring GFP fluorescence 

GFP fluorescence of the cell culture was measured using an Infinite F200
microplate reader (TECAN).

Construction of gTOW6000 clones and the analysis 

The entire scheme of gTOW6000 analysis is shown in Figure 1. The 
gTOW6000 analysis was separated into eight steps as follows. 

Design primers to amplify each target gene (step 1), and amplify the target
genes using PCR (step 2) 

In this study, we attempted to analyze all protein-coding genes on 
the S. cerevisiae chromosome. To clone all genes with their regu- 
latory regions for "Characterized" and "Uncharacterized" ORFs, we 
amplified a DNA fragment containing each target ORF with up- 
stream and downstream regions spanning the neighboring ORFs. 
We ignored "Dubious" ORFs, autonomously replicating sequences
(ARS), and other RNA elements. Supplemental Figure S2A presents 
an example of the analysis. Each region shown in blue was cloned 
into individual pTOW plasmids. It is thus possible that the plasmid 
CNL is determined by the effect of non-ORF elements within each 
clone instead of the cloned protein-coding genes. This possibility 
is addressed by the frameshift mutation analysis described in
another section. Supplemental Figure S2B shows the design of the 
primers used to amplify the regions containing target genes by 
PCR. The primers consist of 23-bp priming sequences of the
neighboring ORFs and 25-bp adaptor sequences of the vector for
gap-repair cloning. The adaptor sequences of the up primer and
the down primer were 5'-cggccgctctagaactagtGGATCC...-3' and
5'-attgggtaccgggccccccCTCGAG...-3', respectively. The sequences
shown in capital letters in the up and down primer sequences are
the BamHI and XhoI sites, respectively. The primer sequences of
pTOWug2-836 are shown in Supplemental Figure S1B. According
to the annotation of SGD (released on July 28, 2007), primers for 

Figure 8. Interchromosomal interactions connected with DSGs and their partner genes. Locations of DSGs and their partner genes and their
interactions identified in this study are visualized using Circos software (Krzywinski et al. 2009). The locations of 115 yeast DSGs are also shown.



amplifying 5806 genes were designed using a Perl script. Each gene 
was amplified by PCR using each primer set and the BY4741 genome
as a template (first PCR). Of the PCR products obtained, 98.4%
had the correct size. For the genes for which we could not
obtain PCR products, we redesigned the primers. If the distance to 
the neighboring gene was too large, then we shortened the length 
of the noncoding region to 1 kb. If the target ORF was too large, we 
designed primers as listed in Supplemental Table S9 to amplify seg- 
ments of the gene and connected the segments by gap repair (see 
below). We thus redesigned primers for 90 genes. The primer sets for 
genes next to each of the 16 centromeres were first designed to
ensure that the amplified fragments contain the centromeres. As
expected, all 32 of the gTOW plasmids carrying DNA fragments with
centromeres were maintained at one copy per cell (data not shown).
We thus redesigned primers to remove the centromeres.
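
The primer layout described above (a 25-bp vector adaptor followed by a 23-bp gene-specific priming sequence) can be sketched in Python as follows. This is only an illustration and not the Perl script used in the study: the function name and the example priming sequence are hypothetical, and the priming sequences are assumed to be supplied already in 5'-to-3' primer orientation.

```python
# Adaptor sequences as given in the text (lowercase vector sequence,
# uppercase restriction site).
UP_ADAPTOR = "cggccgctctagaactagtGGATCC"    # 25 bp, ends with the BamHI site
DOWN_ADAPTOR = "attgggtaccgggccccccCTCGAG"  # 25 bp, ends with the XhoI site
PRIMING_LENGTH = 23                         # gene-specific part, in bp


def build_primer(adaptor, priming):
    """Join a vector adaptor and a 23-bp priming sequence (5' to 3')."""
    priming = priming.upper()
    if len(priming) != PRIMING_LENGTH:
        raise ValueError(f"priming sequence must be {PRIMING_LENGTH} bp")
    if set(priming) - set("ACGT"):
        raise ValueError("priming sequence must contain only A, C, G, T")
    return adaptor + priming


# Hypothetical 23-bp priming sequence, for illustration only.
example_priming = "ATGACCGTTAAGCTGTTCAACGA"
up_primer = build_primer(UP_ADAPTOR, example_priming)
print(up_primer, len(up_primer))
```

Each assembled primer is 48 nt long (25-bp adaptor plus 23-bp priming sequence); the adaptor portion provides the homology to the linearized vector that gap-repair cloning relies on.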

Transformation (gap-repair cloning; step 3) and selecting two independent
clones for each gene (step 4) 

The PCR products amplified using the aforementioned primers 
and pTOWug2-836 digested with BamHI and XhoI were simultaneously
introduced into BY4741 yeast cells. Each gene was inserted
via the homologous recombination activity of yeast cells (gap-repair
cloning) (Oldenburg et al. 1997). Each transformed colony
contained plasmids with an insert of the same target gene but an 
independent PCR product (or self-ligated plasmids without any 
insert). Two independent colonies (clones) were thus selected and 
cultivated in SC medium without uracil (SC-Ura). 

Measurement of growth (step 5) and measurement of plasmid copy 
numbers (step 6) 

Each clone was cultivated as described in step 4 in both SC-Ura and 
SC-Leu-Ura at 30°C. The maximum growth rate of the clone cultivated
in SC-Leu-Ura was measured according to the method described 
previously (Moriya et al. 2006). Strains for which no growth was 
observed were assigned a growth rate of 0.1 for descriptive pur- 
poses. After 50 h of cultivation, the plasmid copy number in the 
cultured cells was measured. From the principle of gTOW, the
plasmid copy number determined in the -Leu-Ura condition is
considered to be the CNL of overexpression of each target gene.

Validation of the inserts by PCR (step 7)

The insert of each clone was examined by PCR (insert-check PCR;
icPCR) using primers OSBI0873 (5'-GGCGAAAGGGGGATGTGCTG-3') and
OSBI0870 (5'-GGAAAGCGGGCAGTGAGCGC-3')
(Supplemental Fig. S1B). The size of the insert was determined
using agarose gel electrophoresis. We validated the icPCR products
to ensure that the target genes were correctly cloned as follows:
"NI" meant the PCR product was the same size as the vector
(No-Insert). In this case, we considered that the cloning was
unsuccessful, and we did not adopt the maximum growth rate and copy
number data. "N" meant no PCR product was amplified. "W"
meant the PCR product had the wrong size (different from the
expected size). "D" meant two PCR products were amplified, one
of which had the expected size. In these cases, we adopted the
maximum growth rate and copy number data because it was possible that
there were problems with icPCR (e.g., the target was too large). We
obtained two independent clones for 88.9% of the genes in the 
first cycle. 
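
The icPCR labels can be expressed as a small decision rule. The Python sketch below is one possible reading, not the procedure actually used: the size tolerance, the "OK" label for a single band of the expected size, the handling of band patterns not covered in the text, and all example sizes are assumptions, and "these cases" is taken to mean N, W, and D.

```python
def classify_icpcr(band_sizes, expected_size, vector_size, tolerance=0.1):
    """Assign an icPCR label from the band sizes (bp) observed on a gel.

    expected_size -- product size expected when the target is cloned
    vector_size   -- product size from the vector without an insert
    tolerance     -- assumed relative size tolerance (hypothetical value)
    """
    def matches(size, target):
        return abs(size - target) <= tolerance * target

    if not band_sizes:
        return "N"   # no PCR product amplified
    if len(band_sizes) == 2 and any(matches(b, expected_size) for b in band_sizes):
        return "D"   # two products, one with the expected size
    if len(band_sizes) == 1:
        band = band_sizes[0]
        if matches(band, expected_size):
            return "OK"  # hypothetical label for a correct single band
        if matches(band, vector_size):
            return "NI"  # same size as the empty vector (no insert)
    return "W"       # wrong size, or a pattern not covered in the text


def data_adopted(label):
    """Growth rate and copy number data are discarded only for NI."""
    return label != "NI"


# Hypothetical sizes in bp, for illustration only.
for bands in ([], [300], [2500], [2500, 800], [1200]):
    label = classify_icpcr(bands, expected_size=2500, vector_size=300)
    print(bands, label, data_adopted(label))
```

In this reading, only clones labeled NI are excluded, because the other labels may reflect limitations of icPCR itself rather than failed cloning.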

Isolation of missing clones (step 8) 

For genes for which we could not obtain two clones in step 7, we 
redesigned primers as described in step 1 or selected more colonies 
as described in step 4. We finally obtained two clones for 5548 
genes (95.6%) and one clone for 203 genes (3.5%). We could not 
obtain any positive clones for 55 genes (0.9%).

Genes that were difficult to clone 

We could not obtain any positive clones for YFL037W/TUB2 and 
YFL039C/ACT1, probably because they are too toxic. We thus 
made plasmids with those genes in Escherichia coli and confirmed 
that they were too toxic for the transformants to form colonies 
(data not shown). We thus concluded that they were very low limit 
genes. In addition, for TUB2, we created a promoter-deletion series 
and obtained a TUB2 allele with a 100-bp promoter (tub2d-100, its 
CNL was 2.7). We thus used these data for TUB2. As mentioned 
above, we could not obtain any clones for 55 genes. Approximately 
half of them were retrotransposons and helicases encoded near 
telomeres. 